(Chore): Better telemetry for quarantine query#1100
Conversation
| } | ||
|
|
||
| #[test] | ||
| fn test_quarantine_query_result_mapper() { |
There was a problem hiding this comment.
unit test for this
| pre_test_context: Option<PreTestContext>, | ||
| test_run_result: Option<TestRunResult>, | ||
| render_sender: Option<Sender<DisplayMessage>>, | ||
| quarantine_query_result_override: Option<proto::upload_metrics::trunk::QuarantineQueryResult>, |
There was a problem hiding this comment.
RSpec calls the quarantine config at a separate time than the upload flow. There's a gate above this where we conditionally call GetQuarantineConfig depending on if there are JUnits (which there aren't in RSpec). That's a smell, but I didn't want to undertake that refactor right now, and it's not a meaningful increase in burden to wire in a quarantine query result override here
| } | ||
|
|
||
| #[tokio::test(flavor = "multi_thread")] | ||
| async fn telemetry_upload_metrics_quarantine_query_result_success() { |
There was a problem hiding this comment.
These CLI integration tests require a bit more custom setup because of the scenarios they test
| enum QuarantineQueryResult { | ||
| QUARANTINE_QUERY_RESULT_UNSPECIFIED = 0; | ||
| QUARANTINE_QUERY_RESULT_SUCCESS = 1; | ||
| QUARANTINE_QUERY_RESULT_FAILURE = 2; | ||
| QUARANTINE_QUERY_RESULT_DISABLED = 3; | ||
| QUARANTINE_QUERY_RESULT_SKIPPED = 4; | ||
| // rspec only | ||
| QUARANTINE_QUERY_RESULT_CACHED = 5; | ||
| } |
There was a problem hiding this comment.
- Unspecified: default, shouldn't be sent from new CLI versions in practice
- Success: we successfully got quarantined results back
- Failure: something went wrong. This could also be an auth error (403/404), but we don't differentiate rn here. Realistically we want to track both
- Disabled: either disabled via the command line/env var, or via the repo setting (from our db)
- Skipped: no tests actually failed, no need to look up quarantine config
- Cached: rspec only, quarantined results served from file on disk
There was a problem hiding this comment.
Worth making any of those points an inline comment alongside any of those?
There was a problem hiding this comment.
agreed, I think those comments are nice :)
There was a problem hiding this comment.
Fair enough. Will do
| } | ||
|
|
||
| #[derive(Debug, Clone, Copy, PartialEq, Eq)] | ||
| enum QuarantineLookupSource { |
There was a problem hiding this comment.
Special enum for the rspec/report case
There was a problem hiding this comment.
"Integration" tests underlying the rspec flow. Meaningfully distinct from the CLI integration tests above
| } | ||
| } | ||
|
|
||
| fn publish_minimal_success_test(test_report: &MutTestReport) { |
There was a problem hiding this comment.
This helper keeps things simple for most of the tests below. Note that test_report.publish() is what calls upload and the telemetry call
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## tyler/reorganize-report-tests #1100 +/- ##
=================================================================
+ Coverage 82.15% 82.28% +0.12%
=================================================================
Files 69 69
Lines 15039 15139 +100
=================================================================
+ Hits 12356 12457 +101
+ Misses 2683 2682 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
40ac68a to
5e8f731
Compare
| enum QuarantineQueryResult { | ||
| QUARANTINE_QUERY_RESULT_UNSPECIFIED = 0; | ||
| QUARANTINE_QUERY_RESULT_SUCCESS = 1; | ||
| QUARANTINE_QUERY_RESULT_FAILURE = 2; | ||
| QUARANTINE_QUERY_RESULT_DISABLED = 3; | ||
| QUARANTINE_QUERY_RESULT_SKIPPED = 4; | ||
| // rspec only | ||
| QUARANTINE_QUERY_RESULT_CACHED = 5; | ||
| } |
There was a problem hiding this comment.
agreed, I think those comments are nice :)
| env::set_var(TRUNK_QUARANTINED_TESTS_DISK_CACHE_TTL_SECS_ENV, "0"); | ||
| } | ||
|
|
||
| thread::spawn(|| { |
There was a problem hiding this comment.
is there a specific reason to do this in a separate thread? same goes for the other tests
There was a problem hiding this comment.
According to Cursor, our publish call in rspec does:
tokio::runtime::Builder::new_multi_thread()
.build()
.unwrap()
.block_on(run_upload(...))Without the thread spawn we get
thread 'telemetry_query_result_skipped_without_lookup_on_publish' (3635475) panicked at /home/tyler/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/tokio-1.48.0/src/runtime/scheduler/multi_thread/mod.rs:86:9:
Cannot start a runtime from within a runtime. This happens because a function (like `block_on`) attempted to block the current thread while the thread is being used to drive asynchronous tasks.
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
test telemetry_query_result_skipped_without_lookup_on_publish ... FAILEDIn summary:
MutTestReport::publish() (and is_quarantined()) create their own tokio runtime internally via Runtime::new().block_on(...). Our tests use #[tokio::test] to run the mock server async, so calling MutTestReport directly inside the test body nests runtimes on the same thread.
thread::spawn puts MutTestReport on a separate OS thread with no existing runtime, which matches how RSpec/Ruby calls it synchronously. Same pattern as the other test_report integration tests (quarantine.rs, publish.rs).
I presume that means we're doing something wrong in the report/rspec side, but not sure
There was a problem hiding this comment.
It's kind of confusing, but I see why. I think it's because we can't do anything async due to Rust<->Ruby interop so we use a blocking Tokio executor nested in a function. And you cannot use a Tokio executor within another Tokio executor.
Call from Ruby:
Place that blocks in Rust:
analytics-cli/test_report/src/report.rs
Lines 506 to 510 in 8e5ffb4
Don't feel like you need to make any code changes for this, but if I were to attempt to improve this, I would:
- Convert everything internal to Rust to use
async - Expose a Ruby function which only calls the internal function and does the blocking
This way we could create a test which utilizes the async internal function directly and not have confusing multi-threaded tests :)
Stacked on #1099. Adds missing observability for the result of querying quarantined tests.
Inputs to quarantine query status:
